330 research outputs found
Motion Switching with Sensory and Instruction Signals by designing Dynamical Systems using Deep Neural Network
To ensure that a robot is able to accomplish an extensive range of tasks, it
is necessary to achieve a flexible combination of multiple behaviors. This is
because the design of task motions suited to each situation would become
increasingly difficult as the number of situations and the types of tasks
performed by the robot increase. To handle the switching and combination of multiple
behaviors, we propose a method to design dynamical systems based on point
attractors that accept (i) "instruction signals" for instruction-driven
switching. We incorporate the (ii) "instruction phase" to form a point
attractor and divide the target task into multiple subtasks. By forming an
instruction phase that consists of point attractors, the model embeds a subtask
in the form of trajectory dynamics that can be manipulated using sensory and
instruction signals. Our model comprises two deep neural networks: a
convolutional autoencoder and a multiple time-scale recurrent neural network.
In this study, we apply the proposed method to manipulate soft materials. To
evaluate our model, we design a cloth-folding task that consists of four
subtasks and three patterns of instruction signals, which indicate the
direction of motion. The results show that the robot can perform the required
task by combining subtasks based on sensory and instruction signals. Moreover, our
model determined the relations among these signals using its internal dynamics.
Comment: 8 pages, 6 figures, accepted for publication in RA-L. An accompanying
video is available at https://youtu.be/a73KFtOOB5
Toward Abstraction from Multi-modal Data: Empirical Studies on Multiple Time-scale Recurrent Models
Abstraction tasks are challenging for multi-modal sequences as they
require a deeper semantic understanding and novel text generation for the
data. Although recurrent neural networks (RNNs) can be used to model the
context of time sequences, in most cases the long-term dependencies of
multi-modal data cause the gradients in back-propagation through time training
of RNNs to vanish in the time domain. Recently, inspired by the Multiple Time-scale
Recurrent Neural Network (MTRNN), an extension of the Gated Recurrent Unit (GRU),
called the Multiple Time-scale Gated Recurrent Unit (MTGRU), has been proposed to
learn long-term dependencies in natural language processing. In particular,
it is also able to accomplish the abstraction task for paragraphs given that
the time constants are well defined. In this paper, we compare the MTRNN and
MTGRU in terms of their learning performance as well as their abstraction
representation at a higher level (with slower neural activation). This was done
by conducting two studies based on a smaller data-set (two-dimensional time
sequences from non-linear functions) and a relatively large data-set
(43-dimensional time sequences from iCub manipulation tasks with multi-modal
data). We conclude that gated recurrent mechanisms may be necessary for
learning long-term dependencies in high-dimensional multi-modal data-sets (e.g.
learning of robot manipulation), even when natural language commands were not
involved. But for smaller learning tasks with simple time sequences, generic
versions of recurrent models, such as MTRNN, were sufficient to accomplish the
abstraction task.
Comment: Accepted by IJCNN 201
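For readers unfamiliar with the models compared above, the multiple time-scale idea can be sketched as a leaky-integrator update in which each unit has its own time constant. The following is a minimal illustration, not the authors' code: the function names, the weights, and the time constants (2 and 50) are assumptions chosen for the example.

```python
import math

def mtrnn_step(u, x, w_rec, w_in, tau):
    """One leaky-integrator update; units with a larger tau change more slowly."""
    h = [math.tanh(v) for v in u]  # firing rates computed from internal states
    new_u = []
    for i in range(len(u)):
        drive = sum(w_rec[i][j] * h[j] for j in range(len(u)))
        drive += sum(w_in[i][k] * x[k] for k in range(len(x)))
        new_u.append((1.0 - 1.0 / tau[i]) * u[i] + drive / tau[i])
    return new_u

def demo(steps=10):
    """Drive one fast and one slow unit with the same constant input."""
    tau = [2.0, 50.0]                  # assumed time constants (fast, slow)
    w_rec = [[0.0, 0.0], [0.0, 0.0]]   # no recurrence, to isolate the effect of tau
    w_in = [[1.0], [1.0]]
    u = [0.0, 0.0]
    for _ in range(steps):
        u = mtrnn_step(u, [1.0], w_rec, w_in, tau)
    return u
```

With these assumed settings, the fast unit settles close to the input within ten steps, while the slow unit has covered only a small part of the distance. Slow units of this kind are what allow a model to carry longer-term context.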
Collaboration Development through Interactive Learning between Human and Robot
In this paper, we experimentally investigated interactive learning between human subjects and a robot, and examined its essential characteristics using the dynamical systems approach. Our research concentrated on the navigation system of a specially developed humanoid robot called Robovie and seven human subjects whose eyes were covered, making them dependent on the robot for directions. We compared the usual feed-forward neural network (FFNN) without recursive connections and the recurrent neural network (RNN). Although the performances obtained with both the RNN and the FFNN improved in the early stages of learning, as the subject changed the operation by learning on their own, all performances gradually became unstable and failed. Results of a questionnaire given to the subjects confirmed that the FFNN gives better mental impressions, especially from the aspect of operability. When the robot used a consolidation-learning algorithm using the rehearsal outputs of the RNN, the performance improved even when interactive learning continued for a long time. The questionnaire results then also confirmed that the subject's mental impressions of the RNN improved significantly. The dynamical systems analysis of the RNNs supports these differences and also showed that the collaboration scheme was developed dynamically along with succeeding phase transitions.
Tool-Use Model to Reproduce the Goal Situations Considering Relationship Among Tools, Objects, Actions and Effects Using Multimodal Deep Neural Networks
We propose a tool-use model that enables a robot to act toward a provided goal. It is important to consider features of the four factors (tools, objects, actions, and effects) at the same time because they are related to each other and one factor can influence the others. The tool-use model is constructed with deep neural networks (DNNs) using multimodal sensorimotor data: image, force, and joint angle information. To allow the robot to learn tool-use, we collect training data by controlling the robot to perform various object operations using several tools with multiple actions that lead to different effects. The tool-use model is then trained on these data; it learns sensorimotor coordination and acquires relationships among tools, objects, actions, and effects in its latent space. We can give the robot a task goal by providing an image showing the target placement and orientation of the object. Using the goal image with the tool-use model, the robot detects the features of tools and objects, and automatically determines how to act to reproduce the target effects. The robot then generates actions that adjust to real-time situations even when the tools and objects are unknown and more complicated than the trained ones.
Stable deep reinforcement learning method by predicting uncertainty in rewards as a subtask
In recent years, a variety of tasks have been accomplished by deep
reinforcement learning (DRL). However, when applying DRL to tasks in a
real-world environment, designing an appropriate reward is difficult. Rewards
obtained via actual hardware sensors may include noise, misinterpretation, or
failed observations. The learning instability caused by these unstable signals
is a problem that remains to be solved in DRL. In this work, we propose an
approach that extends existing DRL models by adding a subtask to directly
estimate the variance contained in the reward signal. The model then takes the
feature map learned by the subtask in a critic network and sends it to the
actor network. This enables stable learning that is robust to the effects of
potential noise. The results of experiments in the Atari game domain with
unstable reward signals show that our method stabilizes training convergence.
We also discuss the extensibility of the model by visualizing feature maps.
This approach has the potential to make DRL more practical for use in noisy,
real-world scenarios.
Comment: Published as a conference paper at ICONIP 202
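One common way to have a model estimate the variance of a noisy signal is to fit a mean and a log-variance with a Gaussian negative log-likelihood. The sketch below illustrates that general technique on a toy reward stream. The abstract does not state the loss or architecture used, so the loss, the function names, and the learning rate here are all assumptions.

```python
import math

def gaussian_nll(reward, mean, log_var):
    """Negative log-likelihood of a reward under N(mean, exp(log_var)), up to a constant."""
    return 0.5 * (log_var + (reward - mean) ** 2 / math.exp(log_var))

def fit_reward_stats(rewards, lr=0.05, steps=2000):
    """Fit a mean and a log-variance to the rewards by full-batch gradient descent."""
    mean, log_var = 0.0, 0.0
    n = len(rewards)
    for _ in range(steps):
        var = math.exp(log_var)
        # Analytic gradients of the average NLL with respect to mean and log_var.
        g_mean = sum(-(r - mean) / var for r in rewards) / n
        g_log_var = sum(0.5 * (1.0 - (r - mean) ** 2 / var) for r in rewards) / n
        mean -= lr * g_mean
        log_var -= lr * g_log_var
    return mean, math.exp(log_var)
```

In a full agent, the two fitted scalars would be replaced by network outputs conditioned on the state, and this loss would be added to the main training objective as an auxiliary term.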
Symbol Emergence in Robotics: A Survey
Humans can learn the use of language through physical interaction with their
environment and semiotic communication with other people. It is very important
to obtain a computational understanding of how humans can form a symbol system
and obtain semiotic skills through their autonomous mental development.
Recently, many studies have been conducted on the construction of robotic
systems and machine-learning methods that can learn the use of language through
embodied multimodal interaction with their environment and other systems.
Understanding human social interactions and developing a robot that can
smoothly communicate with human users in the long term are crucially
important and require an understanding of the dynamics of symbol systems. The
embodied cognition and social interaction of participants gradually change a
symbol system in a constructive manner. In this paper, we introduce a field of
research called symbol emergence in robotics (SER). SER is a constructive
approach towards an emergent symbol system. The emergent symbol system is
socially self-organized through both semiotic communications and physical
interactions with autonomous cognitive developmental agents, i.e., humans and
developmental robots. Specifically, we describe some state-of-the-art research
topics concerning SER, e.g., multimodal categorization, word discovery, and
double articulation analysis, that enable a robot to obtain words and their
embodied meanings from raw sensory-motor information, including visual
information, haptic information, auditory information, and acoustic speech
signals, in a totally unsupervised manner. Finally, we suggest future
directions of research in SER.
Comment: submitted to Advanced Robotic
Interactively Robot Action Planning with Uncertainty Analysis and Active Questioning by Large Language Model
The application of Large Language Models (LLMs) to robot action planning
has been actively studied. Natural-language instructions given to an LLM
may be ambiguous or lack information, depending on the task
context. It is possible to adjust the output of the LLM by making the instruction
input more detailed; however, the design cost is high. In this paper, we
propose an interactive robot action planning method that allows the LLM to
analyze and gather missing information by asking questions to humans. The
method can minimize the design cost of generating precise robot instructions.
We demonstrated the effectiveness of our method through concrete examples in
cooking tasks. However, our experiments also revealed challenges in robot
action planning with LLMs, such as asking unimportant questions and assuming
crucial information without asking. Shedding light on these issues provides
valuable insights for future research on utilizing LLMs for robotics.
Comment: 7 pages, 6 figures, accepted at SII 202
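The control flow described in this abstract (detect what is missing, ask, then plan) can be sketched as a short loop. This is only an illustration under strong assumptions: `find_missing` is a hypothetical rule-based stand-in for the LLM's uncertainty analysis, and the slot names and question wording are invented for the example rather than taken from the paper.

```python
def find_missing(instruction_info, required_slots):
    """Hypothetical stand-in for the LLM's analysis: list slots that have no value."""
    return [slot for slot in required_slots if slot not in instruction_info]

def plan_interactively(instruction_info, required_slots, ask):
    """Ask one question per missing slot, then build a plan from the completed info."""
    info = dict(instruction_info)  # copy so the caller's dict is not modified
    questions = []
    for slot in find_missing(info, required_slots):
        question = f"What is the {slot}?"
        questions.append(question)
        info[slot] = ask(question)  # `ask` returns the human's answer
    plan = [f"{slot}: {info[slot]}" for slot in required_slots]
    return plan, questions
```

A call such as `plan_interactively({"dish": "omelette"}, ["dish", "quantity"], ask)` would ask only about the quantity. The failure modes reported in the abstract correspond to this analysis step either listing slots that do not matter or missing ones that do.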